Extracting Nested Collocations
Abstract
This paper provides an approach to the semi-automatic extraction of collocations from corpora using statistics. The growing availability of large textual corpora, and the increasing number of applications of collocation extraction, has given rise to various approaches on the topic. In this paper, we address the problem of nested collocations; that is, those being part of longer collocations. Most approaches till now treated substrings of collocations as collocations only if they appeared frequently enough by themselves in the corpus. These techniques left a lot of collocations unextracted. In this paper, we propose an algorithm for a semi-automatic extraction of nested uninterrupted and interrupted collocations, paying particular attention to nested collocations.
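To make the notion of a nested collocation concrete, the following is a minimal sketch that counts how often a short word sequence occurs in total and how often it occurs only as part of a longer candidate. It is not the algorithm proposed in the paper; the toy corpus, the candidate `LONG`, and the helper names `find_all` and `nested_split` are assumptions made purely for illustration.

```python
def find_all(tokens, pattern):
    """Return the start index of every occurrence of `pattern` in `tokens`."""
    n = len(pattern)
    pattern = tuple(pattern)
    return [i for i in range(len(tokens) - n + 1)
            if tuple(tokens[i:i + n]) == pattern]


def nested_split(tokens, short, long):
    """Return (total, nested) counts for `short`.

    `total` is the number of occurrences of `short` in the corpus;
    `nested` is how many of those lie entirely inside an occurrence of `long`.
    """
    long_spans = [(i, i + len(long)) for i in find_all(tokens, long)]
    total = 0
    nested = 0
    for start in find_all(tokens, short):
        end = start + len(short)
        total += 1
        if any(ls <= start and end <= le for ls, le in long_spans):
            nested += 1
    return total, nested


# Hypothetical toy corpus, already tokenized by whitespace.
tokens = ("the new york stock exchange opened . "
          "traders left the new york stock exchange . "
          "a smaller stock exchange closed").split()

LONG = ("new", "york", "stock", "exchange")

for short in [("york", "stock"), ("stock", "exchange")]:
    total, nested = nested_split(tokens, short, LONG)
    print(short, "total:", total, "nested:", nested,
          "standalone:", total - nested)
```

In this toy corpus, "york stock" never occurs outside the longer candidate, whereas "stock exchange" also occurs on its own. A filter that looks only at standalone frequency would have very little evidence for either substring, which is the situation the abstract describes for nested collocations.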
Similar resources
Retrieving Collocations by Co-occurrences and Word Order Constraints
In this paper, we describe a method for automatically retrieving collocations from large text corpora. This method retrieves collocations in the following stages: 1) extracting strings of characters as units of collocations; 2) extracting recurrent combinations of strings in accordance with their word order in a corpus as collocations. Through the method, a wide range of collocations, especially...
Extracting Verb-Noun Collocations from Text
In this paper, we describe a new method for extracting monolingual collocations. The method, which is based on statistical methods, extracts VN collocations from large textual corpora. Being able to extract a large number of collocations is critical to machine translation and many other applications. The method has an element of snowballing in it. Initially, one identifies a pattern that will prod...
Collocational Translation Memory Extraction Based on Statistical and Linguistic Information
In this paper, we propose a new method for extracting bilingual collocations from a parallel corpus to provide phrasal translation memories. The method integrates statistical and linguistic information to achieve effective extraction of bilingual collocations. The linguistic information includes parts of speech, chunks, and clauses. The method involves first obtaining an extended list of Englis...
Retrieving Domain-Specific Collocations by Co-occurrences and Word Order Constraints
In this paper, we describe a method for automatically retrieving collocations from large text corpora. This method comprises the following stages: (1) extracting strings of characters as units of collocations, and (2) extracting recurrent combinations of strings as collocations. Through this method, various types of domain-specific collocations can be retrieved simultaneously. This method is pr...
Extracting Collocations from Text Corpora
A collocation is a habitual word combination. Collocational knowledge is essential for many tasks in natural language processing. We present a method for extracting collocations from text corpora. By comparison with the SUSANNE corpus, we show that both high precision and broad coverage can be achieved with our method. Finally, we describe an application of the automatically extracted collocati...